Sequencing trait-associated mutations to clone wheat rust-resistance gene YrNAM

Stripe (yellow) rust, caused by Puccinia striiformis f. sp. tritici (Pst), can significantly affect wheat production. Cloning resistance genes is critical for efficient and effective breeding of stripe rust resistant wheat cultivars. One resistance gene (Yr10CG) underlying the Pst resistance locus Yr10 has been cloned. However, subsequent haplotype and linkage analyses indicate the presence of additional Pst resistance gene(s) underlying or near the Yr10 locus. Here, we report the cloning of the Pst resistance gene YrNAM in this region using the method of sequencing trait-associated mutations (STAM). YrNAM encodes a non-canonical resistance protein with a NAM domain and a ZnF-BED domain. We show that both domains are required for resistance. Transgenic wheat harboring the YrNAM gene driven by its endogenous promoter is resistant to stripe rust races CYR32 and CYR33. YrNAM is an ancient gene and is present in the wild wheat species Aegilops longissima and Ae. sharonensis; however, it is absent in most wheat cultivars, which indicates its breeding value.

(see Figure 2, https://www.nature.com/articles/s41467-022-29132-8) and (ii) Wang et al. 2022 who cloned Lr9 by comparing RNAseq data of independently derived EMS mutants to the PacBio IsoSeq data of the wildtype non-mutated parent (see Figure 1, https://assets.researchsquare.com/files/rs-1807889/v1_covered.pdf?c=1657038907). The authors should give credit to the prior art in the literature and cite the two above studies appropriately. I would also suggest that the authors document their STAM pipeline (Figure 1a), e.g. on GitHub, with precise descriptions of the software versions and parameters used at each step.
3. In lines 24-25 the authors state that the seven previously cloned Yr genes were cloned "mostly by map-based cloning" and then cite the review by Bouvet et al. 2022 to back up this statement. However, Bouvet and colleagues do not review how each of the seven individual Yr genes was cloned. A more appropriate citation would be to Hafeez et al. 2021 (https://doi.org/10.1016/j.molp.2021.05.014) who summarize the cloning method for each cloned resistance gene in wheat (see Supplementary Table 2).
4. In line 54 the authors report the exciting novel domain structure of YrNAM, i.e. containing NAM and ZnF-BED domains. Since ZnF-BED domains have been reported in other Poaceae resistance genes, i.e. wheat Yr5, Yr7, barley Rph15 and rice Xo1, the reader is left with the tantalizing question as to how the ZnF-BED domains might or might not be related across all these genes. One gets the feeling that the authors have missed a golden opportunity to conduct a phylogenetic analysis of BED domains as in Figure 3D of Marchal et al. (https://www.nature.com/articles/s41477-018-0236-4) to establish the evolutionary relationship between R gene associated BED-domains in the Poaceae.
5. It doesn't really make sense to say that YrNAM "evolved from the Sitopsis" section (lines 89 and 104). The connotation of this statement is that YrNAM came from Sitopsis into wheat, i.e. through introgression. However, there is to my knowledge no report of Yr10 having been introgressed from a Sitopsis species into wheat. The alternate (and perhaps the more parsimonious) explanation of the findings of the phylogenetic analysis is that YrNAM and its orthologues in Ae. sharonensis and Ae. longissima share a common ancestor that predates the speciation of Ae. sharonensis and Ae. longissima. To support the prior statement of YrNAM "evolving from the Sitopsis" would require additional analysis of the YrNAM sequence and the haplotype in which it resides, and sequence comparison with the syntenic sequences in Ae. sharonensis and Ae. longissima.
6. Line 380: The authors state that the five Sitopsis species are "related to the B subgenome of common wheat (ref 19)". Reference 19 refers to "Ma, S. et al. Mol. Plant 14:1965-1968 (2021)." However, it is now commonly accepted that the Sitopsis species Ae. sharonensis, Ae. longissima, Ae. bicornis and Ae. searsii are in fact not closely related to the B genome of common wheat; this distinction belongs only to Ae. speltoides within the Sitopsis, see e.g. Li, L. et al. Mol Plant 15:488-503 (2022), a paper which the authors also cite in their manuscript.
Figure 4 is that YrNAM is absent from the Ae. bicornis and Ae. speltoides "species" (Lines 382-383). However, an alternate, and possibly more parsimonious scenario is that the species complexes of Ae. bicornis and Ae. speltoides display presence/absence polymorphisms for YrNAM, similar to what the authors found in common wheat. This alternate hypothesis needs to be stated. Moreover, I would recommend that the authors enrich and strengthen their analysis by including the three other recently reported Sitopsis genome assemblies derived from different accessions to the ones analysed in the YrNAM study. These

Minor suggestions:
Lines 81 to 84: the authors describe the physical relationship between YrNAM and markers on chromosome 1B. It would help the reader if the authors provided a supplementary figure to support this description. Also, the terminology "relatively far from" is vague; please consider providing the actual physical and/or genetic distances.
Reviewer #3: Remarks to the Author:
The study cloned a yellow rust resistance gene Yr10 in wheat through STAM (Sequencing Trait-Associated Mutations). Specifically, a resistant wildtype was used to build reference transcript sequences via Iso-seq and the candidate transcript was identified through analyzing regular RNA-seq of EMS mutants using reference transcript sequences. The candidate gene was supported by segregation analysis and confirmed with a transgenic experiment. Expression of the resistance allele in susceptible varieties strongly enhanced yellow rust resistance. Evidence provided by this study for the Yr10 cloning is strong. The strategy, STAM, for cloning is straightforward. Below are additional comments:
The major contribution from this study is cloning of Yr10. One of the key experiments for the cloning success is the effort to secure more than 10 EMS mutants that lost resistance. STAM provides an alternative approach to using a reference genome that is available or built through de novo assembly. However, STAM has a risk when the gene is not expressed or expressed at a low level. Given the reduced cost to produce de novo assembly and the assembly quality is high, would STAM be a recommended approach for a new cloning project? Also, can those causal EMS mutations be identified if RNA-seq reads from mutants are mapped to a reference genome (e.g., the Chinese Spring reference genome) or genomic assembled contigs produced in this study? If not, reasoning to use STAM can be strengthened.
In general, the manuscript could be improved by precisely describing some genetic terms. For example, the resistance allele should be a dominant allele, which is not described; WT P10-46 can be clarified as "a resistant WT"; "All plants with a homozygous C1142T genotype were susceptible" could replace L69-70.
L18: Might be better to replace cisgenics with a more general term.
L49-50: It is not clear what data were used for de novo genome assembly. Also, how did those contigs end up with non-repetitive DNA sequences?
Thank you very much for allowing us to revise the manuscript. We have highlighted the revised content in the manuscript with track changes. In the manuscript, all changes were in response to reviewer comments, except that we revised and added sentences at the end of the abstract because our original abstract was shorter than necessary for Nature Communications. The revised and additional sentences at the end of the abstract state, "Using STAM with seven mutant wheat lines that have a non-functional wheat stripe rust resistance gene Yr10, we show that Yr10 encodes for a protein (named YrNAM) with a NAM and a ZnF-BED domain; based on 12 independent mutants, both domains are necessary for function. Using a vector that encodes for YrNAM with its endogenous promoter, we used stable transgenesis to experimentally demonstrate that  (1) and TuNAC69 (2), and ZnF-BED in Rph15 (3), Xo1 (4, 5) and Yr7 (6). However, all 47 known R proteins in wheat lack a NAM domain (7), and some ZnF-BED-containing R proteins also have a canonical NLR domain. We now make this point explicitly in lines 73-77. To our knowledge, we here reveal a novel R gene architecture with both NAM and ZnF-BED domains. If this novel architecture of an R gene was previously reported, we will cite any such work.

Major comments:
I-1. I do have a major issue with the identification of the new protein and strongly disagree as being potentially the Yr10 R gene for different reasons: The first one being that Yr10 has been shown a long time ago to be susceptible when challenged with isolate CDL-29 (Beaver and Powelson 1969). This isolate has been used by Line and Chen in numerous subsequent publications. This is the necessary assay to demonstrate unequivocally that the newly characterized gene is in fact, without any doubt, Yr10. Until this work is carried out, the proposed sequence cannot be identified as Yr10. Unless clearly demonstrated, this novel gene represents a new Yr R gene located on the distal portion of wheat chromosome 2BS for which the work was carefully documented.
Response: In hindsight, perhaps our description of previous work on Yr10CG was insufficient. We have now expanded this paragraph with more on the erroneous publication, and the identifying features of Yr10. The augmented text now reads, "Yr10, from a Turkish wheat 'PI 178383', confers all-stage resistance to Pst, is deployed in wheat cultivar 'Moro' 3, and is present in AvSYr10NIL 4. Wheat with Yr10 are resistant to multiple Pst strains including CYR29, CYR31, CYR32, and CYR33, but susceptible to CYR34 4-6. Based on preliminary evidence, Laroche and colleagues proposed that a CC-NBS-LRR gene (Genbank AF149112) was a candidate for Yr10 7-8. However, our haplotype and linkage analyses for AF149112, which we called Yr10CG, demonstrated that Yr10CG is not Yr10 9-10; for example, Yr10CG is 1.2 cM away from Yr10, and wheat lines that express AF149112 are susceptible to CYR29."
As additional information to the reviewers, Yr10 is on 1BS, not on 2BS as indicated above. We reviewed the paper by Beaver and Powelson (8), and there is no mention of CDL-29 in that paper. Their only virulent race on Moro was W-57; perhaps W-57 and CDL-29 are the same. Because the reviewer states, "This isolate has been used by Line and Chen in numerous subsequent publications", we wrote to Dr. Xianming Chen (the reviewer's cited author), who is the most famous wheat stripe rust pathologist in the US and who maintains the best wheat stripe rust collection in the US; we asked Dr. Chen about CDL-29 and W-57. According to Dr. Chen, the old race CDL-29 is equivalent to PSTv-23 (synonym, PST-29). CDL-29 is a fairly old race. We are trying to recover it from isolates that were acquired from the USA and have been in storage for over a decade. Back then, it was fairly easy to exchange races even without written permission. However, today, trying to import these races from the U.S. is not a reasonable strategy for us because the paperwork takes a long time and there's no guarantee that these strains would ever be approved for import. In addition, PSTv-23 (=CDL-29) should be avirulent to Yr6 (9), a seedling resistance gene that is in wheat cultivar 'Fielder' (9, 10); PSTv-23 is also avirulent to Yr9 (9), a seedling resistance gene that occurs in a particular genotype of wheat 'CB037' (11). In our study, the genetic complementation was done in the Fielder and the particular CB037 genetic backgrounds, which potentially precludes the use of PSTv-23 (=CDL-29) to corroborate Yr10.
However, Dr. Chen suggested that we use Chinese races, both Yr10-avirulent and Yr10-virulent races, to make sure that we are using the correct lines for Yr10. We have now completed additional experiments and believe that we have fully done this. We have done one trial with the Yr10-avirulent race CYR33 and two independent trials with the Yr10-virulent race CYR34. We now state on lines 94-99, "In T2 generation, PC1213 transgenic plants showed resistance to CYR33, an avirulent race for Yr10 as documented in Moro, P8-13, and P10-46 (Fig. 3a). However, CYR34 is virulent to Yr10 because it caused abundant sporulation in Moro, P8-13, and P10-46 (Fig. 3b).

I-2.
Secondly, non-specific PCR primers were used to probe and follow up Yr10cg (Yuan et al 2012). All PCR primer pairs tested in this publication and cited herein previous publications are targeting the LRR domain, a repetitive region, which always leads to identification of homologs. In addition, we know much better in the field now that the exact R gene sequences are critical for functionality and protecting the challenged plants by pathogen isolates. Divergence by only one nucleotide may be sufficient to change the specificity of any R gene. Based on their previous publications, the authors are well aware of this situation.
Response: We completely agree that the sequence of an R gene governs its specificity and functionality, and that, indeed, one needs extreme caution with primers that are in a repetitive region. However, the reviewer is incorrect about (presumably) our two diagnostic markers Yr10CGE2a and Yr10CGE2b (12), which were designed to detect the presence of AF149112 (Genbank) (13). To clarify, Yr10CGE2a and Yr10CGE2b (12) are specific for a CC-NBS-LRR gene associated with AF149112, which we named Yr10CG and which is not Yr10 (12,14). Below, we illustrate why Yr10CGE2a and Yr10CGE2b are specific for AF149112.

Fig.1 Motif and marker region in the AF149112-corresponding protein
Yr10CGE2a and Yr10CGE2b match 676-799 aa and 698-769 aa, respectively. Thus, the reverse primers of both markers fall outside of the LRR domain, which was purposely designed for specificity for the AF149112 sequence. Although one PCR marker is often sufficient to correctly genotype a gene, because we were dealing with a CC-NBS-LRR sequence with multiple homologs in the wheat genome, we designed two markers and used both to avoid any confusion. In practice, Yr10CGE2a and Yr10CGE2b are very specific, and we only conclude that AF149112 is present when both markers are positive. In other cases, depending on the goals, the LRR domain alone was used to develop PCR markers, e.g. for Pm3a and Pm3f (15).
Second, PCR primers were designed by using AF149112-specific bases versus its close homologues (Fig. S1 in (12)). The 3' end of each primer normally contains two or more continuous bases specific to AF149112. In this case, the reverse primer of Yr10CGE2a contains a 10-bp insertion specific to AF149112 (Fig. S1 in (12)). Furthermore, we searched Yr10CGE2a (371 bp) in currently accessible wheat genomes, which contain far more sequence data than were available in 2012 (12). As shown in Fig. 2A below, there are many hits for Yr10CGE2a in WheatOmics (http://wheatomics.sdau.edu.cn/); however, the reverse primer (from 351 to 371 bp) of Yr10CGE2a is highly divergent among all close hits, except for two hits from 'Jagger' (chr. Un) and 'Norin 61' (chr. 1B) (Fig. 2A).
We then searched the full-length sequence of AF149112 in 18 wheat genotypes (Fig. 2B); the best hits from Norin 61 (Fig. 2B) and Jagger (Fig. 2C) share more than 99.8% sequence identity with AF149112, but the top hits from the other 16 genotypes share less than 94% sequence identity with AF149112. Certainly, the AF149112 sequence occurs in Jagger and Norin 61, but not in other genotypes. Therefore, Yr10CGE2a is specific for the AF149112 fragment in wheat. From this study, we already knew that the YrNAM gene, which represents Yr10, is absent in all 18 wheat genotypes, although there are some >1 kb fragments sharing a sequence identity of up to 81% with YrNAM.
The AF149112-corresponding sequence in Jagger is partially incomplete in the 5'-and However, the Yr10 donor germplasm PI178383 was first collected from Turkey in 1948 (18); the first Yr10 cultivar Moro was registered in 1966, with a pedigree of PI178383/Omar (19). Therefore, it is unlikely that there is a Yr10 gene in Norin 61 that is derived from PI178383 or Moro.
Third, the PCR specificity was tested on isogenic lines of 'Avocet S', 'Avocet R' and 'Avocet S+Yr10'. Both Yr10CGE2a and Yr10CGE2b only worked in Avocet S+Yr10, but not in Avocet S and Avocet R (Fig. S2 in (12)). The specificity was further confirmed by sequencing the PCR products from Avocet S+Yr10, Moro, Nanda2419 and Jiangdongmen (12).
Altogether, we were extremely careful about the specificity of our primers, and are confident that Yr10CGE2a and Yr10CGE2b are indeed specific for the AF149112 fragment.
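The 3'-end argument above can be illustrated with a minimal Python sketch. It counts mismatches between a primer and a candidate annealing site within the last few bases of the 3' end, and treats the primer as likely specific when every off-target site differs at two or more of those positions. The function names, the window of five bases, and the toy sequences are assumptions for illustration only; they are not the Yr10CGE2a or Yr10CGE2b primer sequences, and the check does not require the mismatches to be contiguous.

```python
def three_prime_mismatches(primer, site, window=5):
    """Count mismatches in the last `window` bases at the primer's 3' end.

    `primer` and `site` are equal-length strings in the same 5'->3'
    orientation; `site` is the stretch of a candidate template that the
    primer would anneal to (hypothetical input, already oriented).
    """
    if len(primer) != len(site):
        raise ValueError("primer and site must have the same length")
    tail_primer = primer[-window:].upper()
    tail_site = site[-window:].upper()
    return sum(1 for a, b in zip(tail_primer, tail_site) if a != b)


def is_likely_specific(primer, off_target_sites, window=5, min_mismatches=2):
    """True if every off-target site has enough 3'-end mismatches."""
    return all(
        three_prime_mismatches(primer, site, window) >= min_mismatches
        for site in off_target_sites
    )


# Toy sequences (made up): the homolog differs at the last two bases.
primer = "ACGTACGTACGTAGGCT"
homolog = "ACGTACGTACGTAGGAA"
identical_copy = "ACGTACGTACGTAGGCT"

print(three_prime_mismatches(primer, homolog))
print(is_likely_specific(primer, [homolog]))
print(is_likely_specific(primer, [homolog, identical_copy]))
```

In practice, specificity was established experimentally (isogenic lines and sequencing of PCR products, as described above); a check of this kind only screens candidate primers in silico.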

I-3. Sufficient information is provided to reproduce most of the work.
Response: Thank you!

Overall impression:
Ni and associates report on the cloning of the stripe rust resistance gene Yr10. Remarkably, Yr10 encodes for a protein with NAM and ZnF-BED domains, thus constituting a novel domain architecture among the nearly 300 plant disease resistance genes cloned to date. Yr10 was identified as a single candidate gene by comparing

Response: Thanks for the constructive comments. We apologize for omitting them. We now cite both in Line 50: "Recently, similar approaches have been used to clone wheat rust resistance genes Lr9 11 and Sr62 12". As requested, we added precise descriptions on GitHub (https://github.com/Feiny/STAM), and added a note in Data availability. And we have changed "We invented" to "We developed" and "we introduce" to "we used".

Response: Thanks for your valuable comment. We phylogenetically analyzed R gene associated BED-domains in Poaceae and now present the data in Supplemental Figure 8 and Supplemental Figure 9. We added the following sentences to Lines 111-117: The ZnF-BED domains were highly conserved among these YrNAM orthologs/homologs, especially among genes annotated in chromosomes 1A, 1B and 1S (Supplemental Fig. 8). In contrast, there was 37-44% identity between the YrNAM and the ZnF-BED domains of the six characterized NLR-BED genes for disease resistance in the Poaceae 13-16. The phylogenetic tree analysis showed the ZnF-BED domain of YrNAM was separated from those ZnF-BED domains of NLR proteins in Poaceae (Supplemental Fig. 9).

II.5
It doesn't really make sense to say that YrNAM "evolved from the Sitopsis" section (lines 89 and 104). The connotation of this statement is that YrNAM came from Sitopsis into wheat, i.e. through introgression. However, there is to my knowledge no report of Yr10 having been introgressed from a Sitopsis species into wheat. The alternate (and perhaps the more parsimonious) explanation of the findings of the phylogenetic analysis is that YrNAM and its orthologues in Ae. sharonensis and Ae. longissima share a common ancestor that predates the speciation of Ae. sharonensis and Ae. longissima.
To support the prior statement of YrNAM "evolving from the Sitopsis" would require additional analysis of the YrNAM sequence and the haplotype in which it resides, and sequence comparison with the syntenic sequences in Ae. sharonensis and Ae. longissima.

II.9 Minor suggestions:
Lines 81 to 84: the authors describe the physical relationship between YrNAM and markers on chromosome 1B. It would help the reader if the authors provided a supplementary figure to support this description. Also, the terminology "relatively far from" is vague; please consider providing the actual physical and/or genetic distances.
Response: Supplemental Figure 3 has been added. We replaced "relatively far from" with "around 1.0 Mb".

Addgene. In addition, upon request, we will provide the construct for research purposes at no charge. We added this note in the Data availability.
Response: All corrected.

III. Reviewer #3 (Remarks to the Author):
Overall impression: The study cloned a yellow rust resistance gene Yr10 in wheat through STAM (Sequencing Trait-Associated Mutations). Specifically, a resistant wildtype was used to build reference transcript sequences via Iso-seq and the candidate transcript was identified through analyzing regular RNA-seq of EMS mutants using reference transcript sequences. The candidate gene was supported by segregation analysis and confirmed with a transgenic experiment. Expression of the resistance allele in susceptible varieties strongly enhanced yellow rust resistance. Evidence provided by this study for the Yr10 cloning is strong. The strategy, STAM, for cloning is straightforward.
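The candidate-identification logic summarized above can be sketched in a few lines of Python. For each reference transcript, the sketch counts how many independent mutants carry an EMS-type variant (G>A or C>T, the transitions EMS predominantly induces) and ranks transcripts by that count. The input layout, the function name `rank_candidates`, and the transcript and mutant identifiers are hypothetical; this is an illustration of the idea under those assumptions, not the authors' pipeline, which also involves read mapping and variant calling that are omitted here.

```python
from collections import defaultdict

# EMS predominantly induces G>A and C>T transitions.
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def rank_candidates(variants_by_mutant):
    """Rank transcripts by the number of independent mutants hit.

    variants_by_mutant: dict mapping mutant name -> list of
    (transcript_id, position, ref_base, alt_base) tuples
    (hypothetical layout for variants called against the
    wild-type reference transcripts).
    """
    mutants_hit = defaultdict(set)
    for mutant, variants in variants_by_mutant.items():
        for transcript_id, _position, ref, alt in variants:
            if (ref.upper(), alt.upper()) in EMS_TRANSITIONS:
                mutants_hit[transcript_id].add(mutant)
    counts = {tid: len(names) for tid, names in mutants_hit.items()}
    # Most frequently mutated transcript first; ties broken by name.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


# Toy data: tx_A is mutated in all three mutants, tx_B in one,
# and tx_C carries only a non-EMS-type change (A>G), so it is ignored.
example = {
    "m1": [("tx_A", 1142, "C", "T"), ("tx_B", 88, "G", "A")],
    "m2": [("tx_A", 530, "G", "A")],
    "m3": [("tx_A", 77, "G", "A"), ("tx_C", 12, "A", "G")],
}
ranking = rank_candidates(example)
print(ranking)
```

A transcript that is hit in most or all independent loss-of-resistance mutants is the natural candidate; as the reviewer notes below, this only works if the gene is expressed well enough to be represented in the reference transcripts.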
Response: Thank you! We are particularly grateful for your comments, "Evidence provided by this study for the Yr10 cloning is strong. The strategy, STAM, for cloning is straightforward."

Major comments:
III.1
The major contribution from this study is cloning of Yr10. One of the key experiments for the cloning success is the effort to secure more than 10 EMS mutants that lost resistance. STAM provides an alternative approach to using a reference genome that is available or built through de novo assembly. However, STAM has a risk when the gene is not expressed or expressed at a low level. Given the reduced cost to produce de novo assembly and the assembly quality is high, would STAM be a recommended approach for a new cloning project? Also, can those causal EMS mutations be identified if RNA-seq reads from mutants are mapped to a reference genome (e.g., the Chinese Spring reference genome) or genomic assembled contigs produced in this study? If not, reasoning to use STAM can be strengthened.
Response: Thanks! The reviewer raises interesting and important questions about STAM technology. At the end of the last paragraph, we have added the following text, "For cloning Yr10, STAM was a low-cost option; YrNAM is absent in assembled wheat genomes; de novo genomic assemblies of complex, polyploid genomes such as wheat are still expensive; and, in general, independent EMS mutants in wheat are relatively easy to obtain, particularly when the trait is encoded by a single gene. STAM may not be the best option for identifying genes for polygenic traits, and is unsuitable for identifying genes with mutations in regulatory parts of a gene. We note that we identified Yr10 with STAM even though Yr10 apparently had low expression, which might be due to highly localized expression, with only 8 circular consensus sequencing

III.2
In general, the manuscript could be improved by precisely describing some genetic terms. For example, the resistance allele should be a dominant allele, which is not described; WT P10-46 can be clarified as "a resistant WT"; "All plants with a homozygous C1142T genotype were susceptible" could replace L69-70.
Response: Thanks! We modified Line 128: "Yr10 has a previously unidentified architecture" was replaced by "The dominant resistance allele has a previously unidentified structure". WT P10-46 was clarified as "a resistant WT" in Lines 52, 82, 105, 160, 278, and 391. Lines 83-84 have been replaced by "All plants with a homozygous C1142T genotype were susceptible".

III.3
L18: Might be better to replace cisgenics with a more general term.